AITopics | self-concordant function

Collaborating Authors

self-concordant function

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Projection-Free Online Convex Optimization via Efficient Newton Iterations

Neural Information Processing SystemsApr-24-2026, 06:23:45 GMT

This paper presents new projection-free algorithms for Online Convex Optimization (OCO) over a convex domain K Rd. Classical OCO algorithms (such as Online Gradient Descent) typically need to perform Euclidean projections onto the convex set K to ensure feasibility of their iterates. Alternative algorithms, such as those based on the Frank-Wolfe method, swap potentially-expensive Euclidean projections onto Kfor linear optimization over K. However, such algorithms have a sub-optimal regret in OCO compared to projection-based algorithms. In this paper, we look at a third type of algorithms that output approximate Newton iterates using a self-concordant barrier for the set of interest. The use of a self-concordant barrier automatically ensures feasibility without the need of projections. However, the computation of the Newton iterates requires a matrix inverse, which can still be expensive. As our main contribution, we show how the stability of the Newton iterates can be leveraged to only compute the inverse Hessian a vanishing fractions of the rounds, leading to a new efficient projection-free OCO algorithm with a state-of-the-art regret bound.

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Add feedback

c6a79e139ec4f371701ea8cc9e06018e-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-11-2026, 20:43:05 GMT

algorithm 1, convex, equation, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.48)

Add feedback

BeyondTikhonov: FasterLearningwith Self-ConcordantLossesviaIterativeRegularization

Neural Information Processing SystemsFeb-11-2026, 18:56:29 GMT

The theory of spectral filtering is a remarkable tool to understand the statistical properties of learning with kernels. For least squares, it allows to derive various regularization schemes that yield faster convergence rates of the excess risk than with Tikhonov regularization. This is typically achieved by leveraging classical assumptions called source and capacity conditions, which characterize the difficulty of the learning task. In order to understand estimators derived from other lossfunctions,Marteau-Fereyetal.

artificial intelligence, estimator, machine learning, (18 more...)

Neural Information Processing Systems

Country: Europe > France (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Improved Dynamic Regret for Non-degenerate Functions

Lijun Zhang, Tianbao Yang, Jinfeng Yi, Rong Jin, Zhi-Hua Zhou

Neural Information Processing SystemsNov-21-2025, 13:01:02 GMT

Neural Information Processing Systems http://nips.cc/

dynamic regret, learner, sequence, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Iowa (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

c6a79e139ec4f371701ea8cc9e06018e-Supplemental-Conference.pdf

Neural Information Processing SystemsAug-18-2025, 19:58:58 GMT

Our work draws heavily upon geometric properties of self-concordant functions, which underpin the rich theory of interior-point methods.

algorithm 1, convex, equation, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.48)

Add feedback

Optimistic Interior Point Methods for Sequential Hypothesis Testing by Betting

Chen, Can, Wang, Jun-Kun

arXiv.org Artificial IntelligenceFeb-11-2025

The technique of "testing by betting" frames nonparametric sequential hypothesis testing as a multiple-round game, where a player bets on future observations that arrive in a streaming fashion, accumulates wealth that quantifies evidence against the null hypothesis, and rejects the null once the wealth exceeds a specified threshold while controlling the false positive error. Designing an online learning algorithm that achieves a small regret in the game can help rapidly accumulate the bettor's wealth, which in turn can shorten the time to reject the null hypothesis under the alternative $H_1$. However, many of the existing works employ the Online Newton Step (ONS) to update within a halved decision space to avoid a gradient explosion issue, which is potentially conservative for rapid wealth accumulation. In this paper, we introduce a novel strategy utilizing interior-point methods in optimization that allows updates across the entire interior of the decision space without the risk of gradient explosion. Our approach not only maintains strong statistical guarantees but also facilitates faster null hypothesis rejection in critical scenarios, overcoming the limitations of existing approaches.

artificial intelligence, hypothesis, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2502.07774

Country:

North America > United States > California > San Diego County > San Diego (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Leisure & Entertainment > Gambling (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)

Add feedback

Improved Dynamic Regret for Non-degenerate Functions

Lijun Zhang, Tianbao Yang, Jinfeng Yi, Rong Jin, Zhi-Hua Zhou

Neural Information Processing SystemsOct-4-2024, 06:36:21 GMT

Recently, there has been a growing research interest in the analysis of dynamic regret, which measures the performance of an online learner against a sequence of local minimizers. By exploiting the strong convexity, previous studies have shown that the dynamic regret can be upper bounded by the path-length of the comparator sequence. In this paper, we illustrate that the dynamic regret can be further improved by allowing the learner to query the gradient of the function multiple times, and meanwhile the strong convexity can be weakened to other non-degenerate conditions. Specifically, we introduce the squared path-length, which could be much smaller than the path-length, as a new regularity of the comparator sequence. When multiple gradients are accessible to the learner, we first demonstrate that the dynamic regret of strongly convex functions can be upper bounded by the minimum of the path-length and the squared path-length. We then extend our theoretical guarantee to functions that are semi-strongly convex or selfconcordant. To the best of our knowledge, this is the first time that semi-strong convexity and self-concordance are utilized to tighten the dynamic regret.

dynamic regret, gradient descent, sequence, (14 more...)

Neural Information Processing Systems

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
North America > United States > Washington > King County > Bellevue (0.04)
North America > United States > Iowa > Johnson County > Iowa City (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

How Well Can Transformers Emulate In-context Newton's Method?

Giannou, Angeliki, Yang, Liu, Wang, Tianhao, Papailiopoulos, Dimitris, Lee, Jason D.

arXiv.org Machine LearningMar-5-2024

Transformer-based models have demonstrated remarkable in-context learning capabilities, prompting extensive research into its underlying mechanisms. Recent studies have suggested that Transformers can implement first-order optimization algorithms for in-context learning and even second order ones for the case of linear regression. In this work, we study whether Transformers can perform higher order optimization methods, beyond the case of linear regression. We establish that linear attention Transformers with ReLU layers can approximate second order optimization algorithms for the task of logistic regression and achieve $\epsilon$ error with only a logarithmic to the error more layers. As a by-product we demonstrate the ability of even linear attention-only Transformers in implementing a single step of Newton's iteration for matrix inversion with merely two layers. These results suggest the ability of the Transformer architecture to implement complex algorithms, beyond gradient descent.

iteration, regression, transformer, (14 more...)

arXiv.org Machine Learning

2403.03183

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Fast sampling from constrained spaces using the Metropolis-adjusted Mirror Langevin Algorithm

Srinivasan, Vishwak, Wibisono, Andre, Wilson, Ashia

arXiv.org Machine LearningDec-14-2023

We propose a new method called the Metropolis-adjusted Mirror Langevin algorithm for approximate sampling from distributions whose support is a compact and convex set. This algorithm adds an accept-reject filter to the Markov chain induced by a single step of the mirror Langevin algorithm (Zhang et al., 2020), which is a basic discretisation of the mirror Langevin dynamics. Due to the inclusion of this filter, our method is unbiased relative to the target, while known discretisations of the mirror Langevin dynamics including the mirror Langevin algorithm have an asymptotic bias. We give upper bounds for the mixing time of the proposed algorithm when the potential is relatively smooth, convex, and Lipschitz with respect to a self-concordant mirror function. As a consequence of the reversibility of the Markov chain induced by the algorithm, we obtain an exponentially better dependence on the error tolerance for approximate sampling. We also present numerical experiments that corroborate our theoretical findings.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2312.08823

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(4 more...)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Adaptive Optimization Algorithms for Machine Learning

Hanzely, Slavomír

arXiv.org Artificial IntelligenceNov-16-2023

Machine learning assumes a pivotal role in our data-driven world. The increasing scale of models and datasets necessitates quick and reliable algorithms for model training. This dissertation investigates adaptivity in machine learning optimizers. The ensuing chapters are dedicated to various facets of adaptivity, including: 1. personalization and user-specific models via personalized loss, 2. provable post-training model adaptations via meta-learning, 3. learning unknown hyperparameters in real time via hyperparameter variance reduction, 4. fast O(1/k^2) global convergence of second-order methods via stepsized Newton method regardless of the initialization and choice basis, 5. fast and scalable second-order methods via low-dimensional updates. This thesis contributes novel insights, introduces new algorithms with improved convergence guarantees, and improves analyses of popular practical algorithms.

sketch-and-project method, stochastic gradient descent, stochastic optimization, (14 more...)

arXiv.org Artificial Intelligence

2311.10203

Country: